ABSTRACT

We propose a general framework to scrutinize the performance of semi-analytic codes 
of galaxy formation. The approach is based on the analysis of the outputs from the 
model after a series of perturbations in the input parameters controlling the baryonic 
physics. The perturbations are chosen in a way that they do not change the results in 
the luminosity function or mass function of the galaxy population. 

We apply this approach to a particular semi-analytic model called GalICS. We
choose to perturb the parameters controlling the efficiency of star formation and the
efficiency of supernova feedback. We keep track of the baryonic and observable
properties of the central galaxies in a sample of dark matter halos with masses ranging
from $10^{10}\,M_\odot$ to $10^{13}\,M_\odot$.

We find very different responses depending on the halo mass. For small dark matter
halos the central galaxy responds in a highly predictable way to small perturbations
in the star formation and feedback efficiencies. For massive dark matter halos, minor
perturbations in the input parameters can induce large fluctuations in the properties
of the central galaxy, at least $\sim 0.1$ in $B-V$ color or $\sim 0.5$ mag in the U or r filter, in
a seemingly random fashion. We quantify this behavior through an objective scalar
function we call predictability.

We argue that finding the origin of this behavior requires additional information
from other approximations and different semi-analytic codes. Furthermore, the
implementation of a scalar objective function, such as the predictability, opens the door to
quantitative benchmarking of semi-analytic codes based on their numerical performance.

Key words: methods: N-body simulations - galaxies: formation - galaxies: evolution



1 INTRODUCTION 

Hierarchical aggregation seems to be at the heart of galaxy
evolution. In a cold dark matter universe, as depicted by
numerical simulations, structure grows through successive
mergers, with no fragmentation. The growth and evolution
of galaxies, which are thought to use dark matter as
scaffolding, is channeled through this hierarchical
aggregation, at least for the most massive structures
(Springel et al. 2006).

Notwithstanding all the complexity in the process of
galaxy formation and evolution, galaxies are still the most
basic population unit in the description of large-scale
structure in the Universe. Much work is still being
invested in galaxy formation to disentangle the influence of
the hierarchical context set up by dark matter from the
secular baryonic processes on small scales.

Tackling the problem theoretically implies numerical
experiments that follow large-scale structure dynamics and, at
the same time, describe baryonic processes such as
hydrodynamics and radiative cooling. This is still very challenging
for the usual numerical approach, which discretizes space and
time and tries to solve a relevant set of equations to capture
the physics (Abel et al. 2002; Gottlöber et al. 2006). From
the computational point of view it involves achieving an
effective resolution spanning at least 5 orders of magnitude in
mass, length and time (Norman et al. 2007).

To overcome this barrier, the semi-analytic model (hereafter
SAM) approach proposes to describe first the non-linear
clustering of dark matter on large scales, and then to describe
the small-scale baryonic physics through analytical
prescriptions. The connection between the two scales is
provided by the dark matter halo, which is the most basic
unit of non-linear dark matter structure (see Baugh (2006)
and references therein).

The non-linear clustering of dark matter is described
through a merger tree, representing the merging history
of a given dark matter halo. The construction methods
for merger trees vary, ranging from Monte-Carlo
realizations based on theoretical estimates
(Somerville & Kolatt 1999) to numerical ones based on
N-body simulations (Croton et al. 2006), and also including
hybrid approaches that mix numerical and analytical techniques
(Taffoni et al. 2002).

The different analytic implementations of baryonic
processes span a wide range of philosophical approaches,
physical concepts and numerical implementations. Most of them
are nonetheless constructed from correlations observed in our
local patch of the Universe.

Regardless of the details of these models, what is
common to all of them is an underlying merger tree structure
complemented by analytic recipes describing the growth
of baryonic structure inside the merger trees. The ignorance
about the physics included in the analytic recipes is
usually represented by scalars, which in most cases
represent efficiencies of physical processes. This implies that
a given realization of a semi-analytic run is completely
determined by the dark matter input and the baryonic parameters
of the simulation.

Most of the work during the last decade was invested
in adding these parameters and exploring their effect on
average quantities, especially the luminosity function. The
generic parameters to be used have been more or less settled,
and most of the models have achieved a good level of internal
consistency by reproducing some key observational features.
The confidence in the consistency that can be achieved, and
the ease of performing a semi-analytic run, have empowered
modelers to select subsamples and study
the most massive galaxies or correlations among populations
(De Lucia & Blaizot 2007; Hayashi & White 2007).

As the complexity of and interest in semi-analytic
techniques grow, two relevant issues must be addressed in more
detail. The first is error propagation from the uncertainty
in the input parameters, which might matter in a
hierarchical Universe, where the amplification of
small initial errors could be important for the most massive
and hierarchical objects. The second is the development of
objective ways to compare different types of complexity in
semi-analytic models. This could allow, for instance, the
implementation of simple tests with an objective scalar function
to measure the model performance.

The objective of this paper is two-fold: 

• Propose a methodology to weigh the role of secular 
baryonic processes in the context of SAMs. 

• Propose an objective scalar function that captures the
biases and general behavior of semi-analytic models
regardless of their detailed implementation.

These two objectives are a result of the same
perturbative approach we advocate in this paper. This approach
is based on the fact that the only objective information we
have to describe the results of a semi-analytic run is the set of
input parameters of the model. In the perturbative approach,
we perform semi-analytic runs in the neighborhood of some
scalar parameters. This will allow us, as we will show, to get
an idea of the limits of our semi-analytic model.

This paper is structured as follows. In Section 2 we
describe the structure common to all SAMs and from that
point we introduce the concept of perturbations in a
semi-analytic model. In Section 3 we introduce the setup for the
perturbation experiment of our SAM. We describe in
Section 4 the two most relevant qualitative features of the
experiment results. We select one of these qualitative results for
a detailed quantitative analysis with three different
indices; these results are shown in Section 5. We discuss our
results in Section 6.



2 SEMI-ANALYTIC MODELS 
2.1 Common features 

Semi-analytic models exploit the fact that there are two very
different physical scales involved in the process of galaxy
formation and evolution. On large scales dark matter and
gravity are dominant, while on smaller scales complex radiative
processes are central to the development of galactic-sized
structures (Somerville & Primack 1999; Hatton et al. 2003;
Bell et al. 2003; Croton et al. 2006; Monaco et al. 2007).

Inside semi-analytic models all the non-linear dark
matter dynamics is described through the merger tree, which
represents the process of successive mergers building a dark
matter halo. On top of this merger tree, all the complex baryonic
physics is implemented through analytic prescriptions
derived for the most part from observations.

The baryonic processes, in the end, are controlled by
a set of scalars, which represent most of the time either
an efficiency or a threshold value. From a purely functional
point of view, all the baryonic properties $B$ of a dark matter
halo $\mathcal{H}$ are a function of its merger tree $T$ and the set of
scalar parameters controlling the model, $\{\lambda_1 \ldots \lambda_N\}$:

$$B = \mathcal{B}(T, \lambda_1, \ldots, \lambda_N). \tag{1}$$



Furthermore, during a semi-analytic run, the set of
parameters $\{\lambda_1 \ldots \lambda_N\}$ is fixed to be the same for all halos
at all times. Thus, the trees and the scalar values completely
define the outputs.

From the perspective of disentangling the role of
different physical elements in the process of galaxy formation,
the approach commonly followed is to explore
a set of different values for the $\{\lambda_i\}$ parameters, taking as a
gauge the reproduction of the luminosity function of diverse
galaxy populations. This coarse exploration of parameter
space has been carried on until a minimum internal consistency
is achieved, a decision based on the success in reproducing
a wide set of observational constraints.

Nowadays more and more results of semi-analytic
models are being used in a predictive sense: selecting
subsamples of galaxies and trying to explain or predict astrophysical
quantities of interest based on the results of a semi-analytic
model (De Lucia & Blaizot 2007; Hayashi & White 2007).
This has been done without an explicit treatment of the
potential biases and complications introduced by the
semi-analytic model itself.

We are interested in understanding in greater detail the
behavior and limits of semi-analytic methods, regardless of
their detailed implementation across different codes. Our
attempt to deal with the complexity across semi-analytic
models is based on the basic conceptual approach of Eq. (1), which
is the only structural information we have about how
semi-analytic models work.

2.2 Perturbing the model 

We intend to measure the effect of perturbations in the
model

$$B' = \mathcal{B}(T, \lambda_1 + \delta\lambda_1, \ldots, \lambda_N + \delta\lambda_N), \tag{2}$$

where the magnitudes of the perturbations $\delta\lambda_i$ are
constrained in such a way that their effect is not significant on
mean population quantities such as the luminosity function,
meaning that we are not breaking the broad consistency of
the model.

The main objective is to measure the consequences of
these perturbations and to use them as a gauge of the model's
numerical performance, but also to see how and where
the consequences of the perturbations emerge.

If we intend to explore the neighborhood of a given
set of parameters $\{\lambda_i\}$ by making runs at $\{\lambda_i \pm \delta\lambda_i\}$, this
implies performing $2^N$ different runs, where $N$ is the total
number of scalar parameters controlling the model. If we
want to explore the neighborhood around $m$ different values
for each $\lambda_i$, the number of runs becomes $m^N$.

The number of free parameters in SAMs can be at least
6, which means that we should deal with at least $2^6 =
64$ different runs to minimally explore the neighborhood of
$\{\lambda_i\}$. Performing all these runs over a cosmological volume
is unfeasible and perhaps not very useful.
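
The run counting above can be illustrated with a minimal Python sketch. The function names `neighborhood_runs` and `grid_run_count`, and the parameter values used, are hypothetical and chosen only for illustration; they are not part of any semi-analytic code.

```python
from itertools import product


def neighborhood_runs(center, deltas):
    """Enumerate the 2**N parameter sets obtained by shifting every
    parameter by either -delta or +delta around `center`."""
    runs = []
    for signs in product((-1.0, 1.0), repeat=len(center)):
        runs.append(tuple(c + s * d for c, s, d in zip(center, signs, deltas)))
    return runs


def grid_run_count(m, n_params):
    """Number of runs when each of n_params parameters takes m values."""
    return m ** n_params


# Hypothetical example: six parameters, each shifted by 0.01.
center = [1.0] * 6
deltas = [0.01] * 6
print(len(neighborhood_runs(center, deltas)))  # 2**6 = 64 runs
print(grid_run_count(3, 6))                    # 3**6 = 729 runs
```

The exponential growth with the number of parameters is what motivates restricting the exploration to a small subset of parameters.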

The approach we decided to follow in this paper uses 
two simplifications. First, we only perform the simulations 
over a subset of dark matter halos selected at random in a 
box of cosmological size. Second, we explore only the neigh- 
borhood of two scalar parameters. 

The size of the halo subsample is about 1% of the total
number of halos in the simulated cosmological box, and the
parameters we will explore are the star formation efficiency
$\alpha$ and the supernova feedback efficiency $\epsilon$.



3 EXPERIMENT 

The semi-analytic model we use in this paper is a slightly
modified version of that presented in Hatton et al. (2003).
As we do not want to compare our results with observations
or other models, we briefly review only the elements
relevant for our discussion: the dark matter description and the
star formation and supernova feedback implementations.

3.1 Dark Matter 

The dark matter simulation was performed using
cosmological parameters compatible with a first-year WMAP
cosmology (Spergel et al. 2003), $(\Omega_m, \Omega_\Lambda, \sigma_8, h) = (0.30, 0.70, 0.92,
0.70)$, where the parameters stand for the density of
matter, the density of dark energy, the amplitude of the mass density
fluctuations and the Hubble constant in units of 100 km
s$^{-1}$ Mpc$^{-1}$. The simulation volume is a cubic box of side
$100\,h^{-1}$ Mpc with $512^3$ dark matter particles, which sets the
mass of each particle to $5.16 \times 10^{8}\,h^{-1}\,M_\odot$. The simulation
was evolved from an initial redshift $z = 32$ down to redshift
$z = 0$, keeping the particle data for 100 time-steps.

For each recorded timestep we build a halo catalogue using
a friends-of-friends algorithm (Davis et al. 1985) with
linking length $b = 0.2$. Only the groups with 20 or more bound
particles are identified as halos. This sets the minimal mass
for a dark matter halo to $1.03 \times 10^{10}\,h^{-1}\,M_\odot$. These halo
catalogues provide the input for the construction of the merger
trees used as input for the semi-analytic model.

3.2 Star Formation and SN feedback 

The star formation rate is set proportional to
the mass of cold gas, and in the absence of any other
characteristic time scale we impose that the rate at which the gas is
consumed to form stars is given by the dynamical time of
the disc. This is motivated by the observational correlations
known as the Kennicutt-Schmidt relation (Kennicutt 1998).

Hence, in our model, the global star formation rate
on galactic scales is given by the following equation

$$\Psi_\star = \alpha\,\frac{m_{\rm gas}}{t_{\rm dyn}}, \tag{3}$$

where $\alpha$ is an efficiency parameter, and $t_{\rm dyn}$ is the
dynamical timescale of the component we are interested in (disc
or bulge). For $t_{\rm dyn}$ we use the time taken for material at
the half-mass radius to reach either the opposite side of the
galaxy (disc) or its center (bulge), which is given by:

$$t_{\rm dyn} = r_{1/2} \times \pi v^{-1}, \tag{4}$$

where $v$ is a characteristic velocity in the galaxy
component and $r_{1/2}$ is the half-mass radius. For discs the velocity
$v$ is equal to the circular velocity of the disc, where the
material is assumed to have purely circular orbits. In the case of
spheroidal components $v$ is the velocity dispersion, where we
assume the matter in the component has only radial orbits.
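
As an illustration, the dynamical time of Eq. (4) and the resulting star formation rate can be sketched in a few lines of Python. This is only a sketch under stated assumptions: it takes the rate to be the efficiency times the cold gas mass divided by the dynamical time, as described in the text; the function names, the choice of units (kpc, km/s, Gyr, solar masses) and the default efficiency of 0.02 are illustrative and not taken from the model's code.

```python
import math

KPC_IN_KM = 3.0856775814913673e16  # kilometres in one kiloparsec
SEC_PER_GYR = 3.15576e16           # seconds in one Gyr (Julian years)


def dynamical_time_gyr(r_half_kpc, v_kms):
    """t_dyn = pi * r_half / v, returned in Gyr."""
    t_seconds = math.pi * r_half_kpc * KPC_IN_KM / v_kms
    return t_seconds / SEC_PER_GYR


def star_formation_rate(m_gas_msun, r_half_kpc, v_kms, alpha=0.02):
    """Assumed form: SFR = alpha * m_gas / t_dyn, in solar masses per Gyr.

    alpha=0.02 is an illustrative default, not a calibrated value.
    """
    return alpha * m_gas_msun / dynamical_time_gyr(r_half_kpc, v_kms)


# Hypothetical disc: half-mass radius 3 kpc, circular velocity 200 km/s.
print(dynamical_time_gyr(3.0, 200.0))
print(star_formation_rate(1.0e9, 3.0, 200.0))
```

Because the rate is linear in the efficiency, a 1% change in the efficiency changes the instantaneous rate by 1%; any larger response in the final galaxy properties has to build up through the merger history.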

Star formation is triggered if the column density
of the gas is greater than a given threshold constrained by
observations (Kennicutt 1998). For simplicity we assume
that the initial mass function is universal at all redshifts and
follows a Kennicutt initial mass function.

Once stars are formed, the massive stars explode
inside the galaxies, ejecting hot gas and metals into the
interstellar medium. The simple model that we use for this
phenomenon is the implementation of Silk (2001),
where the rate of gas mass loss is written assuming a
stationary model

$$(\text{rate of gas mass loss}) = \Psi_\star \times \eta_{SN}\,\Delta m_{SN} \times (1 + L) \times (1 - e^{-R}), \tag{5}$$

where $\Psi_\star$ is the star formation rate, $\eta_{SN}$ is the number
of supernovae per unit mass of formed stars (a fixed number,
function of the IMF), $\Delta m_{SN}$ is the mean mass loss of one
supernova ($\sim 10\,M_\odot$) and $(1 + L)$ is defined as

$$1 + L = \epsilon\,\frac{m_{\rm gas}}{m_{\rm gal}}, \tag{6}$$

where $m_{\rm gas}$ and $m_{\rm gal}$ are the gas and total mass of
the galaxy component and the parameter $\epsilon$ regulates the
efficiency of the feedback. We also define the porosity of the
galaxy component as

R f^v»(mX n t ( 7) 

\m gas J \ a J 

where $\sigma$ is a typical dispersion velocity in the interstellar
medium (in km/s), which we fix to 10 km/s for disks and to
the velocity dispersion for the spheroidal components. The
parameter $\alpha$ is the star formation efficiency. In this model,
the ejected amount of gas is usually of the same order as the
mass of formed stars.

3.3 Experiment Setup 

From the halos detected at redshift zero we select at random
nearly 600, which corresponds to about 1% of the
total number of halos in the box. For each halo we have its
corresponding merger history, and we are able to run our
galaxy formation code on every individual merger
tree¹.

For each halo we make 320 runs varying two parameters,
the star formation efficiency $\alpha$ and the supernova feedback
efficiency $\epsilon$. The first parameter, $\alpha$, is sampled at 16 points
between 0.018 and 0.022, and the second, $\epsilon$, is sampled at
20 points between 0.18 and 0.20. From this we define a
2-dimensional representation to keep track of each run. We
set up a plane with two coordinates. Along the first dimension
(x-axis) we vary the star formation efficiency $\alpha$, and along the
second dimension (y-axis) we vary the supernova feedback
efficiency $\epsilon$. This defines the $\alpha$-$\epsilon$ plane.
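
The sampling just described can be sketched as follows. The sketch assumes that both parameters are sampled uniformly with the end points included, which the text does not state explicitly; the helper `linspace` and the variable names are illustrative.

```python
def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop, inclusive."""
    step = (stop - start) / (num - 1)
    return [start + k * step for k in range(num)]


# Assumed uniform sampling of the two efficiencies.
alpha_values = linspace(0.018, 0.022, 16)   # star formation efficiency
epsilon_values = linspace(0.18, 0.20, 20)   # supernova feedback efficiency

# plane[i][j] holds the parameter pair for the run at (alpha_i, epsilon_j).
plane = [[(a, e) for e in epsilon_values] for a in alpha_values]

n_runs = sum(len(row) for row in plane)
print(n_runs)  # 16 * 20 = 320 runs per halo
```

Storing the runs as a two-index array mirrors the plane itself, so that nearest neighbours in parameter space are nearest neighbours in the array.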

For each run (for a given halo and for a given point in
the $\alpha$-$\epsilon$ plane) we select the central galaxy in the halo,
which is the only galaxy with a clear identity in a
hierarchical paradigm. For each central galaxy we track six physical
properties: total mass (gas and stars), mass of stars,
bolometric luminosity, absolute magnitude in the SDSS u and SDSS r
filters, and the $B-V$ color. We will refer to the values of
a given galactic property, for a given galaxy, over the $\alpha$-$\epsilon$
plane as a landscape.



4 QUALITATIVE RESULTS 

We present in Fig. 1 the landscapes for the total galactic
mass, stellar mass and SDSS u absolute magnitude for two
galaxies, each column representing one galaxy. Qualitatively
speaking, we can spot a striking difference in this figure.

The left column in Fig. 1 presents the results for a
central galaxy in a halo of mass $\sim 10^{10}\,M_\odot$. Its growth process
has been dominated by what we call smooth accretion,
meaning that at our working resolution this halo has not
suffered any major merger. The predicted properties vary
smoothly over the $\alpha$-$\epsilon$ plane.

In the right column of the same figure, we show the
same properties for the central galaxy in a halo of mass
$\sim 10^{13}\,M_\odot$. In this case the values over the landscape do not
follow any pattern. The biggest difference with respect to the
previous case is that the halo growth cannot be described
by pure accretion, but through repeated mergers.

¹ For technical reasons the code is actually run over a
bundle of merger trees.

A second qualitative feature is the emerging bimodality
of some landscapes. It is visible in the upper right panel of
Fig. 1, where it seems that the values over the landscape are
oscillating back and forth between two planes. To better
illustrate this effect we have constructed the histograms for two
kinds of landscapes (SDSS r and SDSS u magnitudes) for four
different halo masses. The results are shown in Fig. 2, which
shows how the landscapes are not necessarily unimodal. By
visual inspection of half of the landscapes for the total mass,
bolometric luminosity and SDSS r, we can report that
non-unimodality is a recurrent landscape feature.

For the rest of the paper we will be concerned with a 
quantification of the first result, which showed an apparent 
randomness for the central galaxies in strongly hierarchical 
halos. We will use three different indicators. 

First, we will define a scalar function called
predictability, $P$, for a given galactic property over the $\alpha$-$\epsilon$ plane
(Pascual & Levin 1999). The predictability will be almost
one for the low mass case, and zero (or even negative) for
the case of the massive halo.

The second method of quantification is based on the
predictability and the variance over the landscapes. We will
calculate a predictability-weighted variance, which is
intended to represent a quantitative estimate of the
variations we can expect in a galactic property after performing
a minimal perturbation $(\delta\alpha, \delta\epsilon)$.

The last method of quantification compares the vari- 
ance over the landscapes with the variance over a subsam- 
ple of galaxies hosted by halos of similar mass inside the full 
cosmological box. 



5 QUANTITATIVE RESULTS 
5.1 Predictability 

We quantify the first of the qualitative results using
a scalar function we call predictability. First, we sketch out
the general idea behind its definition.

We place ourselves on the $\alpha$-$\epsilon$ plane and want to
predict the value of some galactic quantity at the point where we
are standing, using the information available
in the neighborhood. We have the values of the quantity we
want to measure for the four nearest neighbors in the $\alpha$-$\epsilon$
plane. We make a guess for the value by averaging these four
values, and at the same time we perform the measurement.

We now have two different values at the point in the
$\alpha$-$\epsilon$ plane: one is predicted and the other is measured. If
the squared difference between these two values is small for
each point in the plane, we can be sure that we are over a
smooth landscape. If the squared differences over the plane
are big, the landscape is not so smooth. The predictability
is a measure based on these squared differences.

In practice, we use a discretization of the $\alpha$-$\epsilon$ plane, and
we construct two different scalar fields over that plane. The
first corresponds to the field measured in the numerical runs,
denoted $L$. The second is a predicted version, denoted $L'$.

Figure 1. Results for the sampling of the star formation efficiency $\alpha$ and the supernova feedback efficiency $\epsilon$. Each panel represents the
results of some galactic property in the central galaxy of a dark matter halo. On the left, the results correspond to the central galaxy
in a halo of mass $\sim 10^{10}\,M_\odot$. On the right, to a more massive halo of $\sim 10^{12}\,M_\odot$. For each galaxy we show the results concerning the
total galaxy mass (upper panels), the stellar mass (middle panels) and the magnitude in the SDSS u band (lower panels). Every small
square in each panel shows the result at redshift $z = 0$ for the run with the corresponding value of $\alpha$ and $\epsilon$. The results for the low mass
halo are predictable; for the high mass halo they are almost random. Note that, for instance, in the case of the SDSS u filter the values
fluctuate over a range of $\sim 1.4$ mag. The bulk of our paper is devoted to the quantification of this behavior as a function of halo mass.

Figure 2. Histograms (normalized to add up to unity) of the values over four randomly selected landscapes. From left to right the
mass of the host dark matter halo increases. Upper row: SDSS u magnitudes. Lower row: SDSS r magnitudes.
This illustrates another qualitative feature of the landscapes, namely that sometimes they are bimodal, for instance in the upper-left
panel. We do not try to quantify this behavior in the paper.

The values of $L'(\alpha_i, \epsilon_j)$ are calculated from the
neighboring points in $L(\alpha, \epsilon)$, as follows

$$L'(\alpha_i, \epsilon_j) = \frac{1}{4}\left[L(\alpha_{i+1}, \epsilon_j) + L(\alpha_{i-1}, \epsilon_j) + L(\alpha_i, \epsilon_{j+1}) + L(\alpha_i, \epsilon_{j-1})\right]. \tag{8}$$
We now construct the following quantity

$$Q^2 = \frac{1}{N}\sum_{i,j}\left[L(\alpha_i, \epsilon_j) - L'(\alpha_i, \epsilon_j)\right]^2, \tag{9}$$

where $N$ is the total number of points in the $\alpha$-$\epsilon$
plane. This quantity helps us to define the predictability

$$P = 1 - \frac{Q^2}{\sigma^2}, \tag{10}$$



with $\sigma^2$ as the variance of the landscape

$$\sigma^2 = \frac{1}{N}\sum_{i,j}\left[L(\alpha_i, \epsilon_j) - \bar{L}\right]^2. \tag{11}$$

The predictability is bounded by $P \le 1$. A value of
$P \sim 1$ implies that the landscape is very smooth, while for
values $P \le 0$ the changes between neighboring sites can be large.
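
A minimal Python sketch of this calculation is given below. It assumes that the predictability is one minus the ratio of the mean squared prediction error to the landscape variance, and it makes one choice that the text does not specify: only interior points, which have all four nearest neighbours, enter the squared error, while the variance uses every point. The function name `predictability` and the two test landscapes are illustrative.

```python
def predictability(landscape):
    """Predictability P = 1 - Q^2 / sigma^2 of a 2-D landscape.

    `landscape[i][j]` holds the measured value at (alpha_i, epsilon_j).
    Assumption: only interior points (with four nearest neighbours)
    enter Q^2, so the grid needs at least 3 points per dimension.
    """
    n_alpha, n_eps = len(landscape), len(landscape[0])
    values = [x for row in landscape for x in row]
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    if variance == 0.0:
        raise ValueError("flat landscape: predictability is undefined")

    squared_errors = []
    for i in range(1, n_alpha - 1):
        for j in range(1, n_eps - 1):
            # Prediction: average of the four nearest neighbours.
            predicted = 0.25 * (landscape[i + 1][j] + landscape[i - 1][j]
                                + landscape[i][j + 1] + landscape[i][j - 1])
            squared_errors.append((landscape[i][j] - predicted) ** 2)
    q2 = sum(squared_errors) / len(squared_errors)
    return 1.0 - q2 / variance


# A linear landscape is reproduced exactly by the neighbour average ...
smooth = [[i + 2.0 * j for j in range(20)] for i in range(16)]
# ... while a checkerboard is the opposite of its neighbours everywhere.
rough = [[(-1.0) ** (i + j) for j in range(20)] for i in range(16)]
print(predictability(smooth), predictability(rough))
```

The two toy landscapes bracket the behavior discussed in the text: the smooth one reaches the upper bound of the predictability, and the checkerboard gives a negative value because the neighbour average is a worse predictor than the landscape mean.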

We now turn to the results of Fig. 3, where we plot
the predictability as a function of the logarithm of the mass
of the host halo, for all the galaxies in our study. Starting
with the total galaxy mass (stars and gas) we can see
that the galaxies have high predictability, $P > 0.9$, in most
of the cases. The situation is quite different for the stellar
mass and the bolometric luminosity. In these cases the
predictability ranges almost evenly over $0 < P < 1$, and
we start seeing some fraction of points with negative
predictability. In the case of the $(B-V)$ colors and SDSS u,
SDSS r magnitudes we are in a totally different regime, as
most of the landscapes have negative predictability, with a
few points in the range $0 < P < 1$.



The conclusion from these results is that if we spot a
landscape with a very high predictability, $P > 0.9$, we can be sure that
the galaxy is sitting in halo less massive than 3 x 10 M@. 
In the same vein, when picking the central galaxy in a halo
of mass $> 10^{12}\,M_\odot$, the predictability is surely going to be
lower, $P < 0.9$, or negative in the case of $(B-V)$ colors and
SDSS r, SDSS u magnitudes.

5.2 P- Weighted Landscape Variance 

We now explore the second way of quantifying our
results. It is based on the landscape variance $\sigma_{\alpha\epsilon}$ over the
320 points in the $\alpha$-$\epsilon$ plane (Eq. 11) and the predictability $P$
(Eq. 10).

We want to weight the landscape variance by the
information obtained through the predictability $P$. With this
normalization we can get an idea of how
much a given galactic property should be expected to vary
after performing a perturbation $(\delta\alpha, \delta\epsilon)$.

Using the variance $\sigma_{\alpha\epsilon}$ alone can be misleading in the
case of a high predictability landscape, because it could
overestimate the variation caused by a $(\delta\alpha, \delta\epsilon)$ perturbation.

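We then propose the $P$-weighted variance

$$\sigma_{\alpha\epsilon P} = (1 - e^{P-1}) \times \sigma_{\alpha\epsilon}, \tag{12}$$

which has the property of being bounded between $0 \le
\sigma_{\alpha\epsilon P} \le \sigma_{\alpha\epsilon}$ for the possible values of the predictability
$-\infty < P \le 1$.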
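
The weighting can be written as a one-line function. The sketch below assumes the form of Eq. (12), one minus the exponential of (P - 1), times the landscape variance; the function name `p_weighted_sigma` and the numerical values are illustrative.

```python
import math


def p_weighted_sigma(sigma, p):
    """P-weighted landscape variance: (1 - exp(P - 1)) * sigma.

    `sigma` is the landscape variance and `p` the predictability,
    which cannot exceed 1.
    """
    if p > 1.0:
        raise ValueError("the predictability is bounded by P <= 1")
    return (1.0 - math.exp(p - 1.0)) * sigma


sigma = 0.3  # hypothetical landscape variance, in magnitudes
for p in (1.0, 0.9, 0.0, -5.0):
    print(p, p_weighted_sigma(sigma, p))
```

For a perfectly predictable landscape the weighted value vanishes, and for strongly negative predictability it approaches the unweighted landscape variance, which is the bounding behavior stated above.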

The general trend (Fig. 4) shows a growth of the
$P$-weighted variance with halo mass, consistent with the
fact that the largest values for the predictability come along
with large values for the landscape variance. The results
for the total mass landscape and the bolometric luminosity
stand apart, as this mass trend is less clear than for the
other galactic properties.
Figure 3. Results for the predictability of different properties. Each point represents a central galaxy. From left to right, top to bottom:
total mass of the galaxy, stellar mass, bolometric luminosity, $(B-V)$ color, SDSS u and SDSS r magnitudes. The quantities most dependent
on the star formation history (colors and magnitudes) are observed in general to have a very low predictability. The most
predictable quantity with respect to variations in the $\alpha$-$\epsilon$ plane is the total galactic mass.



The values can be used to measure the variation one
can expect from a 1% perturbation in $(\alpha, \epsilon)$. If we recall
that this is calculated from a 1-$\sigma$ variance, the value of $\sigma_{\alpha\epsilon P}$
that could effectively bracket the fluctuations over the $\alpha$-$\epsilon$
plane would be three times that value. This means that for
the most massive halos, of mass $\sim 10^{13}\,M_\odot$, one can expect
a variation of at least $\sim 0.5$ in the SDSS u magnitudes or
$\sim 0.1$ in the $(B-V)$ colors after a 1% variation in $(\alpha, \epsilon)$. The
variation in the total and stellar mass could reach at least
$\sim 0.1$ dex.



5.3 Cosmological Variance

Now we compare the landscape variance $\sigma_{\alpha\epsilon}$ with the scale
imposed by the cosmological context. We select from the
simulated cosmological box (with $\alpha$ and $\epsilon$ at the center of
the $\alpha$-$\epsilon$ plane) all the dark matter halos with the same mass
(within 1%) as the parent halo of the galaxy under study.
From this halo population, we calculate the variance $\sigma_{\rm halo}$ of
the galactic properties of interest for its central galaxies.

We show in Fig. 5 the logarithm of the ratio of the two
variances ($\sigma_{\rm halo}/\sigma_{\alpha\epsilon}$) as a function of halo mass. For all the
cases (except the total galactic mass) we observe that the
ratio $\sigma_{\rm halo}/\sigma_{\alpha\epsilon}$ diminishes as the halo mass grows.

The case of the $B-V$ color and the SDSS r and SDSS u
band magnitudes seems special. For high masses the two
variances are of the same order of magnitude ($\sigma_{\rm halo}/\sigma_{\alpha\epsilon} < 3$,
$\log_{10}(\sigma_{\rm halo}/\sigma_{\alpha\epsilon}) < 0.5$) in most of the cases.

This suggests that for central galaxies in low mass halos
(where the predictability tends to be high) the variance of
their properties is dominated mostly by the possible mass of
the host halo, while for high masses (where the
predictability tends to be low) the variance of the galactic properties
can be equally important if we vary the halo mass or if we
make a variation in the star formation and feedback
efficiencies.


6 DISCUSSION 

We have used N-body simulations and a semi-analytic
model of galaxy formation to explore the consequences of
small perturbations in the input parameters of the
semi-analytic model. Specifically, we varied the scalar parameters
regulating the star formation efficiency $\alpha$ and the supernova
feedback efficiency $\epsilon$. We followed some physical properties of 600
central galaxies to gauge the effect of the perturbations.

This experiment was motivated both by the interest in
describing semi-analytic models while abstracting away
the details of each model, and by the wish to test the
significance of this approach for inferring the distinctive footprint of
the different baryonic processes.

We find that, depending on the halo mass, there are two
different kinds of behavior with respect to changes in the
input parameters. There is a smooth variation of the
galactic properties for low mass halos, and a seemingly random,
unpredictable behavior limited to high mass halos.

Figure 4. Predictability-weighted variance ($\sigma_{\alpha\epsilon P}$) over the $\alpha$-$\epsilon$ landscape (Eq. 12). This quantity can be used as an estimate of the
expected change in a given galactic property when a variation $(\delta\alpha, \delta\epsilon)$ is performed on the input parameters. In the case of magnitudes
it can easily be as high as 0.5 for a 1% variation in $(\alpha, \epsilon)$. A trend seems to indicate that this weighted variance increases with the dark
matter halo mass.

We have quantified this behavior through an objective
scalar function called predictability, $P$, defined in Eq. (10).
Values $P \sim 1$ mean a rather smooth and predictable response,
while $P \le 0$ points to a more unpredictable behavior. The
highest predictability is found only in low mass halos, while high
mass halos almost always present negative values of $P$. This
notion of predictability depends on the property we are
looking at. In our case the total galactic mass and the total stellar
mass showed the general trend of high predictability on the
$\alpha$-$\epsilon$ plane (upper panels in Fig. 3), while the $(B-V)$ color
and the SDSS u and SDSS r bands showed almost always
negative predictabilities (lower panels in Fig. 3).

Then, we computed the variance $\sigma_{\alpha\epsilon}$ over the $\alpha$-$\epsilon$
plane and weighted it by the predictability over the same
landscape. This helped us to estimate the possible variation
in a given property after performing a change $(\delta\alpha, \delta\epsilon)$ in the
input parameters. Using this measure we found that for high
mass halos one should expect rather large variations in the
galactic properties from a small perturbation in the baryonic
parameters (Fig. 4).

In order to give a scale to the landscape fluctuations
for a given galaxy, we compared the landscape variance with
the variance in a subset of central galaxies of halos taken
from the whole cosmological box simulation. The halos are
selected to have a mass similar to that of the parent halo of
the galaxies we are studying. The general trend shows that at
higher halo masses the variance coming from the modulation of
the α and ε parameters is of the same order of magnitude as
the variation of the galaxy properties over the whole box. The
opposite trend is only found for low halo masses (Fig. 5).
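
The comparison described above can be sketched in a few lines.
This is only an illustration, not the code used for this paper:
the function names and the magnitudes are made up, and we assume
that both quantities are plain population variances of the same
galaxy property, one taken over the perturbed (α, ε) runs of a
single galaxy and one taken over central galaxies in halos of
similar mass.

```python
import math
import statistics


def landscape_variance(landscape):
    """Population variance of one property over a grid of (alpha, epsilon) runs.

    `landscape` is a list of rows; each entry is the value of the property
    obtained for one perturbed (alpha, epsilon) pair (hypothetical layout).
    """
    values = [v for row in landscape for v in row]
    return statistics.pvariance(values)


def log_variance_ratio(halo_sample, landscape):
    """log10 of (variance over similar-mass halos) / (landscape variance)."""
    var_halo = statistics.pvariance(halo_sample)
    var_land = landscape_variance(landscape)
    if var_halo <= 0 or var_land <= 0:
        raise ValueError("both variances must be positive to take the log ratio")
    return math.log10(var_halo / var_land)


# Made-up r-band magnitudes, for illustration only.
landscape_r = [[-21.0, -21.4], [-20.6, -21.0]]  # one galaxy, 2x2 perturbation grid
halo_sample_r = [-21.2, -20.8, -21.0, -21.0]    # centrals of similar-mass halos

ratio = log_variance_ratio(halo_sample_r, landscape_r)
print(f"log10(var_halo / var_landscape) = {ratio:.3f}")
```

A negative value means that the scatter induced by the parameter
perturbations exceeds the scatter among similar halos, which is
the regime reported here for massive halos.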

In this particular case of perturbations related to star
formation and supernova feedback, it seems that the quantities
that depend on the full star formation history (magnitudes
and colours) are the most sensitive to variations of the α
and ε parameters. For instance, most of the trends we found
are absent for the total galaxy mass.

In general, all this evidence seems to point towards a
picture where the central galaxies hosted in massive halos,
which have grown mainly through mergers, are the most
sensitive to small variations of the baryonic parameters, in
a way that is comparable to a significant variation in
the mass of the host halo.

From these results one can expect that if the other scalar
parameters in the model are also varied, the predictability
P, the landscape variance σ_{αε} and the P-weighted
landscape variance σ_{αε,P} should be higher than the values
quoted in this paper. It is very unlikely that the variations
of other parameters could cancel out exactly the influence
of the efficiencies α and ε.

Figure 5. Logarithm of the ratio of the variance calculated over a subset of halos in the simulated cosmological box (σ_halo) to the
landscape variance σ_{αε}. Except for the total galaxy mass, the general trend seems to indicate that the ratio is lower for higher halo
masses. For magnitudes and colours, a variation in the α, ε input parameters can be as important as a 1% fluctuation in the halo mass.
For lower masses, the halo mass fluctuations are dominant.

There is a hierarchy of causes for this behavior that
must be explored. First of all, might this be a signature
of the hierarchical build-up of galaxies? In such a picture it
is easy to imagine that mild perturbations at early times might
add up to finally yield very different values for very similar
initial conditions. This would account for the relatively large
values of the variance over the α-ε plane compared to the
intrinsic variances over the whole population of similar halos
in the cosmological volume, but probably not for the low
predictability values.

Could this be an artifact coming from the semi-analytic
models of galaxy formation? In these models, the choice of
which galaxy is to be considered the central galaxy generally
depends on which galaxy is the most massive. This is
ambiguous when several galaxies inside a dark matter halo
have similar masses; in that case the selection of the central
galaxy might be subject to noise. This could explain in part
the seemingly random landscapes for high mass haloes.
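
The ambiguity described above can be illustrated with a toy
selection rule. This is a sketch under stated assumptions, not
the selection implemented in any particular semi-analytic code:
the function name, the `tolerance` threshold and the masses are
hypothetical.

```python
def pick_central(galaxy_masses, tolerance=0.05):
    """Return (index of the most massive galaxy, ambiguity flag).

    `tolerance` is a hypothetical relative mass difference below which the
    two most massive galaxies are treated as indistinguishable.
    """
    if not galaxy_masses:
        raise ValueError("halo contains no galaxies")
    # Indices sorted from most to least massive.
    order = sorted(range(len(galaxy_masses)),
                   key=lambda i: galaxy_masses[i], reverse=True)
    central = order[0]
    if len(order) == 1:
        return central, False
    m1 = galaxy_masses[order[0]]
    m2 = galaxy_masses[order[1]]
    ambiguous = (m1 - m2) <= tolerance * m1
    return central, ambiguous


# Made-up galaxy masses (solar masses) inside one massive halo.
baseline = [1.00e11, 0.98e11, 2.0e10]
perturbed = [0.97e11, 0.99e11, 2.0e10]  # same halo after a small perturbation

print(pick_central(baseline))   # galaxy 0 is selected, flagged as ambiguous
print(pick_central(perturbed))  # galaxy 1 is selected, flagged as ambiguous
```

In this toy case a mass change of a few percent is enough to
swap the galaxy labelled as central, so every property tracked
for the "central galaxy" jumps from one object to another.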

On the last level of the hierarchy, could this be coming
from our code? This is impossible to confirm without
performing the same kind of experiment with another fully
fledged semi-analytic model, which takes us to the issue of
comparison between semi-analytic models of galaxy formation.
The predictability, as a meaningful scalar objective
function, opens the possibility of measuring the biases of
different semi-analytic codes. This could allow the
comparison of different codes based on their numerical
performance, going beyond the rather ill-posed strategy of
comparison based on astrophysical performance, i.e.
reproducing observations.

Finally, the small perturbations we made on the scalar
parameters were constructed so as not to have any effect on
the mean properties of the galaxies, such as the luminosity
function. This means that formally the galaxies we have
produced at every perturbation are consistent with the overall
galaxy population.

As a consequence of all this, studies making use of
selected subpopulations from a wider population generated
using semi-analytic models should bear in mind that this
smaller population might not be unique. The dispersion in
this subsample of galaxies, coming from the perturbations
that can be induced on every parameter in the model, should
be explicitly stated. Including that dispersion (in the form
of error bars, for instance) seems a necessary condition to
make fair use of semi-analytic models, acknowledging in
an explicit manner their limitations on predictability.
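
One possible way of reporting such a dispersion is sketched
below. It is an illustration only: the run labels, the colour
values and the choice of the population standard deviation as
the error bar are assumptions, and the perturbed runs are
assumed to be ones that leave the luminosity function unchanged.

```python
import statistics


def perturbation_error_bars(runs):
    """Mean and scatter of one property per galaxy across perturbed runs.

    `runs` maps a run label to a list of property values, one per galaxy,
    with the galaxies in the same order in every run (hypothetical layout).
    """
    lengths = {len(values) for values in runs.values()}
    if len(lengths) != 1:
        raise ValueError("every run must contain the same galaxies")
    per_galaxy = zip(*runs.values())
    return [(statistics.mean(vals), statistics.pstdev(vals)) for vals in per_galaxy]


# Made-up (B - V) colours for a two-galaxy subsample.
runs = {
    "fiducial": [0.60, 0.80],
    "alpha +1%": [0.62, 0.70],
    "epsilon +1%": [0.58, 0.90],
}

bars = perturbation_error_bars(runs)
for i, (mean, scatter) in enumerate(bars):
    print(f"galaxy {i}: B-V = {mean:.2f} +/- {scatter:.2f}")
```

Quoting the scatter next to each value makes explicit how much
of a reported property depends on the particular parameter
choice rather than on the galaxy itself.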